 Newark




TRACE: A Self-Improving Framework for Robot Behavior Forecasting with Vision-Language Models

arXiv.org Artificial Intelligence

Predicting the near-term behavior of a reactive agent is crucial in many robotic scenarios, yet remains challenging when observations of that agent are sparse or intermittent. Vision-Language Models (VLMs) offer a promising avenue by integrating textual domain knowledge with visual cues, but their one-shot predictions often miss important edge cases and unusual maneuvers. Our key insight is that iterative, counterfactual exploration--where a dedicated module probes each proposed behavior hypothesis, explicitly represented as a plausible trajectory, for overlooked possibilities--can significantly enhance VLM-based behavioral forecasting. We present TRACE (Tree-of-thought Reasoning And Counterfactual Exploration), an inference framework that couples tree-of-thought generation with domain-aware feedback to refine behavior hypotheses over multiple rounds. Concretely, a VLM first proposes candidate trajectories for the agent; a counterfactual critic then suggests edge-case variations consistent with partial observations, prompting the VLM to expand or adjust its hypotheses in the next iteration. This creates a self-improving cycle where the VLM progressively internalizes edge cases from previous rounds, systematically uncovering not only typical behaviors but also rare or borderline maneuvers, ultimately yielding more robust trajectory predictions from minimal sensor data. We validate TRACE on both ground-vehicle simulations and real-world marine autonomous surface vehicles. Experimental results show that our method consistently outperforms standard VLM-driven and purely model-based baselines, capturing a broader range of feasible agent behaviors despite sparse sensing. Evaluation videos and code are available at trace-robotics.github.io.
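The propose-then-critique cycle described in this abstract can be illustrated with a minimal sketch. It is not the TRACE implementation: `refine_hypotheses`, `toy_propose`, and `toy_critique` are hypothetical names, the two toy functions stand in for the VLM and the counterfactual critic, and hypotheses are plain strings rather than trajectories.

```python
def refine_hypotheses(propose, critique, observations, rounds=3):
    """Alternate proposal and counterfactual critique for a fixed number of rounds."""
    hypotheses = []
    feedback = []
    for _ in range(rounds):
        # The proposer sees the observations plus the critic's latest feedback.
        for candidate in propose(observations, feedback):
            if candidate not in hypotheses:
                hypotheses.append(candidate)
        # The critic looks for edge cases the current hypothesis set misses.
        feedback = critique(hypotheses, observations)
        if not feedback:
            break  # nothing left to add, so stop early
    return hypotheses


def toy_propose(observations, feedback):
    # Hypothetical stand-in for the VLM: one typical behavior plus any suggestions.
    return ["continue straight"] + list(feedback)


def toy_critique(hypotheses, observations):
    # Hypothetical stand-in for the critic: suggest one uncovered edge case per round.
    edge_cases = ["sudden stop", "sharp left turn"]
    return [e for e in edge_cases if e not in hypotheses][:1]


result = refine_hypotheses(toy_propose, toy_critique, observations=["agent seen at t=0"])
print(result)
```

Each round widens the hypothesis set by whatever the critic flagged in the previous round, which is the self-improving behavior the abstract refers to.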


WiiM Amp Pro Review: Name a Better Network Amp, We'll Wait

WIRED

From a quiet corner of Linkplay Technologies headquarters in Newark, California, WiiM has rapidly become one of the real forces in affordable network audio streaming. If you foresaw a brand that's not even four years old picking up the slack left in the wake of Sonos' self-immolation last year, congratulations--your powers of prescience are considerably better than mine. This Amp Pro is the company's latest demonstration of its entry-level prowess. A mere $379 buys a compact (2.6 x 7.5 x 8.5in, HxWxD), tidily constructed aluminum box that's equipped to power a single pair of passive loudspeakers and provide a gateway to network music streaming. It's ready to become part of a multiroom and/or smart home system in conjunction with Amazon Echo, Google Nest, Linkplay and WiiM devices.


GRUvader: Sentiment-Informed Stock Market Prediction

arXiv.org Artificial Intelligence

Stock price prediction is challenging due to global economic instability, high volatility, and the complexity of financial markets. Hence, this study compared several machine learning algorithms for stock market prediction and further examined the influence of a sentiment analysis indicator on the prediction of stock prices. Our results were two-fold. Firstly, we used a lexicon-based sentiment analysis approach to identify sentiment features, thus evidencing the correlation between the sentiment indicator and stock price movement. Secondly, we proposed the use of GRUvader, an optimal gated recurrent unit network, for stock market prediction. Our findings suggest that stand-alone models struggled compared with AI-enhanced models. Thus, our paper makes further recommendations on the latter systems.


Optimizing Automated Picking Systems in Warehouse Robots Using Machine Learning

arXiv.org Artificial Intelligence

With the rapid growth of global e-commerce, the demand for automation in the logistics industry is increasing. This study focuses on automated picking systems in warehouses, utilizing deep learning and reinforcement learning technologies to enhance picking efficiency and accuracy while reducing system failure rates. Through empirical analysis, we demonstrate the effectiveness of these technologies in improving robot picking performance and adaptability to complex environments. The results show that the integrated machine learning model significantly outperforms traditional methods, effectively addressing the challenges of peak order processing, reducing operational errors, and improving overall logistics efficiency. Additionally, by analyzing environmental factors, this study further optimizes system design to ensure efficient and stable operation under variable conditions. This research not only provides innovative solutions for logistics automation but also offers a theoretical and empirical foundation for future technological development and application.


ScenEval: A Benchmark for Scenario-Based Evaluation of Code Generation

arXiv.org Artificial Intelligence

In the scenario-based evaluation of machine learning models, a key problem is how to construct test datasets that represent various scenarios. The methodology proposed in this paper is to construct a benchmark and attach metadata to each test case. Then a test system can be constructed with test morphisms that filter the test cases based on metadata to form a dataset. The paper demonstrates this methodology with large language models for code generation. A benchmark called ScenEval is constructed from problems in textbooks, an online tutorial website and Stack Overflow. Filtering by scenario is demonstrated and the test sets are used to evaluate ChatGPT for Java code generation. Our experiments found that the performance of ChatGPT decreases with the complexity of the coding task. It is weakest for advanced topics like multi-threading, data structure algorithms and recursive methods. The Java code generated by ChatGPT tends to be much shorter than the reference solution in terms of number of lines, while it is more likely to be more complex in both cyclomatic and cognitive complexity metrics, if the generated code is correct. However, the generated code is more likely to be less complex than the reference solution if the code is incorrect.
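The idea of attaching metadata to each test case and filtering on it to form a scenario-specific dataset can be sketched as follows. This is an illustration only, not the ScenEval test system: `BenchmarkCase`, `filter_by_metadata`, the metadata keys, and the three sample tasks are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class BenchmarkCase:
    """One benchmark entry: a task prompt plus free-form metadata (hypothetical schema)."""
    task_id: str
    prompt: str
    metadata: dict = field(default_factory=dict)


def filter_by_metadata(benchmark, **criteria):
    """Return the cases whose metadata matches every given key/value pair."""
    return [
        case for case in benchmark
        if all(case.metadata.get(key) == value for key, value in criteria.items())
    ]


# Made-up sample data; the real benchmark's fields and contents may differ.
benchmark = [
    BenchmarkCase("t1", "Reverse a string.", {"topic": "strings", "source": "textbook"}),
    BenchmarkCase("t2", "Sum a list recursively.", {"topic": "recursion", "source": "textbook"}),
    BenchmarkCase("t3", "Start two worker threads.", {"topic": "multi-threading", "source": "forum"}),
]

recursion_set = filter_by_metadata(benchmark, topic="recursion")
textbook_set = filter_by_metadata(benchmark, source="textbook")
print([case.task_id for case in recursion_set])
print([case.task_id for case in textbook_set])
```

Keeping the scenario definition in the filter criteria, rather than in separate dataset files, lets one benchmark serve many scenarios.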


Uncertainty Measurement of Deep Learning System based on the Convex Hull of Training Sets

arXiv.org Artificial Intelligence

Deep Learning (DL) has made remarkable achievements in computer vision and has been adopted in safety-critical domains such as medical imaging and autonomous driving. It is therefore necessary to understand the uncertainty of a model in order to reduce accidents and losses caused by misjudgments of Deep Neural Networks (DNNs). A starting point is to efficiently select data that could cause the model to malfunction. Traditionally, data collection and labeling have been done manually, but test data selection methods have recently emerged that focus on capturing samples unrelated to what the model has learned. These samples are selected based on the activation patterns of neurons in the DNN or on entropy minimization over the softmax output. However, these methods cannot quantitatively analyze the extent to which unseen samples are extrapolated from the training data. Therefore, we propose To-hull Uncertainty and Closure Ratio, which measure the uncertainty of a trained model based on the convex hull of the training data. The approach observes the positional relation between the convex hull of the learned data and an unseen sample, and infers how far the sample is extrapolated from the convex hull. To evaluate the proposed method, we conduct empirical studies on popular datasets and DNN models and compare against state-of-the-art test selection metrics. The experiments show that To-hull Uncertainty is more effective at finding samples with unusual patterns (e.g. adversarial attacks) than existing test selection metrics.
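The positional relation between a convex hull and an unseen sample can be illustrated in two dimensions. The sketch below is not the paper's To-hull Uncertainty or Closure Ratio, whose definitions the abstract does not give. It only checks whether a point lies inside the hull of some training points, using Andrew's monotone chain algorithm; the function names and sample points are assumptions, and real feature spaces would be high-dimensional.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain repeats the first point of the other.
    return lower[:-1] + upper[:-1]


def inside_hull(hull, q):
    """True if q is inside or on a counter-clockwise hull with at least 3 vertices."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], q) >= 0 for i in range(n))


# Toy "training set": the corners of a square plus one interior point.
training_points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
hull = convex_hull(training_points)
print(hull)
print(inside_hull(hull, (1, 1)), inside_hull(hull, (3, 1)))
```

A sample that falls outside the hull, such as `(3, 1)` here, is one the model would have to extrapolate to, which is the kind of case the abstract is concerned with.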


A former Google engineer was arrested for allegedly stealing AI secrets for Chinese rivals

Engadget

A former Google engineer was arrested in California on Wednesday for stealing more than 500 files containing artificial intelligence trade secrets from the company and using the information to benefit rival tech companies in China. In an indictment that was unsealed in a federal California court, prosecutors accused Linwei Ding, a 38-year-old Chinese national who started working at Google in 2019, of uploading trade secrets from his Google-issued laptop to personal cloud storage accounts. The documents that Ding stole involved "building blocks" of Google's AI infrastructure, according to the indictment. Ding was arrested in Newark, California, and charged with four counts of theft of trade secrets. If convicted, he faces up to 10 years in prison and a fine of up to $250,000 for each count.


Ex-Google engineer arrested for alleged theft of AI secrets for Chinese firms

The Guardian

A Chinese software engineer has been arrested for allegedly stealing artificial intelligence technology from Google while secretly working for two Chinese companies. Linwei Ding, 38, also known as Leon Ding, faces four counts of theft of trade secrets, the US attorney general, Merrick Garland, said in a statement. Ding, who was arrested on Wednesday in Newark, California, allegedly transferred confidential information from Google's network to his personal account while secretly affiliated with Chinese-based companies in the AI industry. "The justice department will not tolerate the theft of artificial intelligence and other advanced technologies that could put our national security at risk," Garland said. "We will fiercely protect sensitive technologies developed in America from falling into the hands of those who should not have them."